Field of the Invention

This invention relates to expression vectors containing a DNA 
sequence from the human cytomegalovirus major immediate early gene, 
to host cells containing such vectors, to a method of producing a 
desired polypeptide by using vectors containing said sequence and to 
the use of said DNA sequence. 

Background to the Invention

The main aim of workers in the field of recombinant DNA technology 
is to achieve as high a level of production as possible of a 
particular polypeptide. This is particularly true of commercial 
organisations who wish to exploit the use of recombinant DNA 
technology to produce polypeptides which naturally are not very 
abundant. 

Generally the application of DNA technology involves the cloning of 
a gene encoding the desired polypeptide, placing the cloned gene in 
a suitable expression vector, transfecting a host cell line with the 
vector, and culturing the transfected cell line to produce the 
polypeptide. It is almost impossible to predict whether any 
particular vector or cell line or combination thereof will lead to a 
useful level of production. 



In general, the factors which significantly affect the amount of
polypeptide produced by a transfected cell line are: 1. gene copy
number, 2. efficiency with which the gene is transcribed and the
mRNA translated, 3. the stability of the mRNA and 4. the efficiency
of secretion of the protein.

The majority of work directed at increasing expression levels of 
recombinant polypeptides has focussed on improving transcription 
initiation mechanisms. As a result the factors affecting efficient 
translation are much less well understood and defined, and generally 
it is not possible to predict whether any particular DNA sequences 
will be of use in obtaining efficient translation.

Attempts at investigating translation have consisted largely of
varying the DNA sequence around the consensus translation start
signal to determine what effect this has on translation initiation
(Kozak M. Cell 44 283-292 (1986)).

Studies involving expression of desired heterologous genes normally
use both the coding sequence and at least part of the
5'-untranslated sequence of the heterologous gene such that
translation initiation is from the natural sequence of the gene.
This approach has been found to be unreliable, probably as a result
of the 'hybrid nature' of the 5'-untranslated region and the fact
that the presence of particular 5'-untranslated sequences can lead to
poor initiation of translation (Kozak M. Proc. Natl. Acad. Sci. 83
2850-2854 (1986) and Pelletier and Sonenberg Cell 40 515-526
(1985)). This variation in translation has a detrimental effect on
the amount of the product produced.

Previous studies (Boshart et al Cell 41 521-530 (1985) and Pasleau
et al Gene 38 227-232 (1985); Stenberg et al J. Virol 49 (1)
190-199 (1984); Thomsen et al Proc. Natl. Acad. Sci. USA 81 659-663
(1984) and Foecking and Hofstetter Gene 45 101-105 (1986)) have used
sequences from the upstream region of the hCMV-MIE gene in
expression vectors. These have, however, solely been concerned with
the use of the sequences as promoters and/or enhancers. Spaete and
Mocarski (J. Virol 56 (1) 135-143, 1985) have used a PstI to PstI
fragment of the hCMV-MIE gene encompassing the promoter, enhancer
and part of the 5'-untranslated region, as a promoter for expression
of heterologous genes. In order to obtain translation the natural
5'-untranslated region of the heterologous gene was used.

In published European Patent Application No. 260148, a method for
the continuous production of a heterologous protein is described.
The expression vectors constructed contain part of the
5'-untranslated region of the hCMV-MIE gene as a stabilising
sequence. The stabilising sequence is placed in the 5'-untranslated
region of the gene encoding the desired heterologous protein, i.e.
the teaching is again that the natural 5'-untranslated region of the
gene is essential for translation.

Summary of the Invention

In a first aspect the invention provides a vector containing a DNA
sequence comprising the promoter, enhancer and substantially
complete 5'-untranslated region including the first intron of the
major immediate early gene of human cytomegalovirus.

In a preferred embodiment of the first aspect of the invention, the
vector includes a restriction site for insertion of a heterologous
gene.

The present invention is based on the discovery that vectors
containing a DNA sequence comprising the promoter, enhancer and
complete 5'-untranslated region of the major immediate early gene of
the human cytomegalovirus (hCMV-MIE) upstream of a heterologous gene
result in high level expression of the heterologous gene product.
In particular, we have unexpectedly found that when the hCMV-MIE
derived DNA is linked directly to the coding sequence of the
heterologous gene, high levels of mRNA translation are achieved.

This efficient translation of mRNA is achieved consistently and
appears to be independent of the particular heterologous gene being
expressed.

In a second aspect the invention provides a vector containing a DNA
sequence comprising the promoter, enhancer and substantially
complete 5'-untranslated region including the first intron of the
major immediate early gene of human cytomegalovirus upstream of a
heterologous gene.






The hCMV-MIE derived DNA according to the second aspect of the
invention may be separated from the coding sequence of the
heterologous gene by intervening DNA, such as for example by the
5'-untranslated region of the heterologous gene. Advantageously the
hCMV-MIE derived DNA may be linked directly to the coding sequence
of the heterologous gene.

In a preferred embodiment of the second aspect of the invention, the
invention provides a vector containing a DNA sequence comprising the
promoter, enhancer and substantially complete 5'-untranslated region
including the first intron of the hCMV-MIE gene linked directly to
the DNA coding sequence of the heterologous gene.

Preferably the hCMV-MIE derived sequence includes a sequence
identical to the natural hCMV-MIE translation initiation signal. It
may however be necessary or convenient to modify the natural
translation initiation signal to facilitate linking the coding
sequence of the desired polypeptide to the hCMV-MIE sequence, i.e.
by introducing a convenient restriction enzyme recognition site.
For example the translation initiation site may advantageously be
modified to provide an NcoI recognition site.

The heterologous gene may be a gene coding for any eukaryotic
polypeptide such as for example a mammalian polypeptide such as an
enzyme, e.g. chymosin or gastric lipase; an enzyme inhibitor, e.g.
tissue inhibitor of metalloproteinase (TIMP); a hormone, e.g. growth
hormone; a lymphokine, e.g. an interferon; a plasminogen activator,
e.g. tissue plasminogen activator (tPA) or prourokinase; or a
natural, modified or chimeric immunoglobulin or a fragment thereof
including chimeric immunoglobulins having dual activity such as
antibody-enzyme or antibody-toxin chimeras.

According to a third aspect of the invention there are provided host
cells transfected with vectors according to the first or second
aspect of the invention.

The host cell may be any eukaryotic cell such as for example plant
or insect cells but is preferably a mammalian cell such as for
example CHO cells or cells of myeloid origin e.g. myeloma or
hybridoma cells.

In a fourth aspect the invention provides a process for the
production of a heterologous polypeptide by culturing a transfected
cell according to the third aspect of the invention.

In a fifth aspect the invention provides the use of a DNA sequence
comprising the promoter, enhancer and substantially complete
5'-untranslated region including the first intron of the hCMV-MIE
gene for expression of a heterologous gene.

In a preferred embodiment of the fifth aspect of the invention the
hCMV-MIE derived DNA sequence is linked directly to the DNA coding
sequence of the heterologous gene.

Also included within the scope of the invention are plasmids pCMGS,
pHT.1 and pEE6.hCMV.



Brief Description of the Drawings

The present invention is now described, by way of example only, with
reference to the accompanying drawings in which:

Figure 1: shows a diagrammatic representation of plasmid pSVLGS.1

Figure 2: shows a diagrammatic representation of plasmid pHT.1

Figure 3: shows a diagrammatic representation of plasmid pCMGS

Figure 4: shows the complete sequence of the hCMV-MIE
promoter-enhancer including both the first intron and a modified
translation 'start' site

Figure 5: shows a diagrammatic representation of plasmid pEE6.hCMV

Detailed Description of the Embodiments

Example 1

The Pst-1m fragment of hCMV (Boshart et al Cell 41 521-530 (1985);
Spaete & Mocarski J. Virol 56 (1) 135-143 (1985)) contains the
promoter-enhancer and most of the 5'-untranslated leader of the MIE
gene including the first intron. The remainder of the
5'-untranslated sequence can be recreated by attaching a small
additional sequence of approximately 20 base pairs.

Many eukaryotic genes contain an NcoI restriction site (5'-CCATGG-3')
overlapping the translation start site, since this sequence
frequently forms part of a preferred translation initiation signal
5'-ACCATGPu-3'. The hCMV-MIE gene does not have an NcoI site at the
beginning of the protein coding sequence but a single base-pair
alteration causes the sequence both to resemble more closely the
"Kozak" consensus initiation signal and to introduce an NcoI
recognition site. Therefore a pair of complementary oligonucleotides
were synthesised of the sequence:

    GTCACCGTCCTTGACAC
    |||||||||||||||||
ACGTCAGTGGCAGGAACTGTGGTAC

which when fused to the Pst-1m fragment of hCMV will recreate the
complete 5'-untranslated sequence of the MIE gene with the single
alteration of a G to a C at position -1 relative to the translation
initiation codon.
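
The pairing of the two oligonucleotides and the effect of the base change can be checked mechanically. The Python sketch below is illustrative only and is not part of the described method: the variable and function names are hypothetical, and the strand orientation (upper strand read 5' to 3', lower strand read 3' to 5') is an assumption inferred from the complementarity of the printed sequences. The junction is modelled by appending the CATGG that an NcoI-cut end contributes.

```python
# Illustrative check of the synthetic adaptor; all names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Base-by-base complement, without reversing the strand."""
    return "".join(COMPLEMENT[base] for base in seq)

# Strands as printed. Assumed orientation: upper 5'->3', lower 3'->5'.
upper = "GTCACCGTCCTTGACAC"
lower = "ACGTCAGTGGCAGGAACTGTGGTAC"

# The lower strand extends four bases beyond the upper strand at each end.
left_overhang = lower[:4]
paired_region = lower[4:4 + len(upper)]
right_overhang = lower[4 + len(upper):]

pairs_correctly = paired_region == complement(upper)

# Read 5'->3', the single-stranded ends are those left by PstI and NcoI.
pst_end = left_overhang[::-1]    # TGCA, compatible with a PstI end
nco_end = right_overhang[::-1]   # CATG, compatible with an NcoI end

# Junction after ligation to an NcoI-cut coding sequence (contributes CATGG),
# compared with the same context carrying the natural G at position -1.
modified_junction = upper + "CATGG"
natural_junction = upper + "GATGG"
NCOI_SITE = "CCATGG"

print(pairs_correctly, pst_end, nco_end)
print(NCOI_SITE in modified_junction, NCOI_SITE in natural_junction)
```

Under these assumptions the central 17 bases pair exactly, the two overhangs match PstI and NcoI ends, and only the junction with C at position -1 contains the NcoI recognition sequence.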

This synthetic DNA fragment was introduced between the hCMV Pst-1m
promoter-enhancer leader fragment and a glutamine synthetase (GS)
coding sequence by ligation of the Pst-1m fragment and the synthetic
oligomer with NcoI digested pSV2.GS to generate a new plasmid, pCMGS
(the production of pSV2.GS is described in published International
Patent Application No. WO 87/04462). pCMGS is shown in Figure 3.
pCMGS thus contains a hybrid transcription unit consisting of the
following: the synthetic oligomer described above upstream of the
hCMV-MIE promoter-enhancer (where it serves merely as a convenient
PstI - NcoI "adaptor"), the hCMV-MIE promoter and the complete
5'-untranslated region of the MIE gene, including the first intron,
fused directly to the GS coding sequence at the translation
initiation site.

pCMGS was introduced into CHO-K1 cells by calcium phosphate mediated
transfection and the plasmid was tested for the ability to confer
resistance to the GS-inhibitor methionine sulphoximine (MSX). The
results of a comparison with pSV2.GS are shown in Table 1.

It is clear that pCMGS can confer resistance to 20 µM MSX at a
similar frequency to pSV2.GS, demonstrating that active GS enzyme is
indeed expressed in this vector.



Table 1

Results of transfection of GS-expression vectors into CHO-K1 cells

Vector      No. colonies/10^6 cells resistant to 20 µM MSX
pSV2.GS     32
pCMGS       17
Control     0



Example 2

The TIMP cDNA and SV40 polyadenylation signal as used in pTIMP 1
(Docherty et al (1985) Nature 318, 66-69) were inserted into pEE6
between the unique HindIII and BamHI sites to create pEE6.TIMP. pEE6
is a bacterial vector from which sequences inhibitory to replication
in mammalian cells have been removed. It contains the XmnI to BclI
portion of pCT54 (Emtage et al 1983 Proc. Natl. Acad. Sci. USA 80,
3671-3675) with a pSP64 (Melton et al 1984 Nucleic Acids Res. 12,
7035) polylinker inserted in between the HindIII and EcoRI sites.
The BamHI and SalI sites have been removed from the polylinker by
digestion, filling in with Klenow enzyme and religation. The BclI
to BamHI fragment is a 237 bp SV40 early polyadenylation signal
(SV40 2770 to 2533). The BamHI to BglI fragment is derived from
pBR328 (375 to 2422) with an additional deletion between the SalI
and the AvaI sites (651 to 1425) following the addition of a SalI
linker to the AvaI site. The sequence from the BglI to the XmnI
site originates from the β-lactamase gene of pSP64.
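
The map coordinates quoted above can be turned into approximate segment lengths. The short Python sketch below is a hedged illustration only: it takes a segment length to be the difference between its two coordinates (a convention that reproduces the stated 237 bp for the SV40 fragment) and ignores the SalI linker and any bases gained or lost at filled-in junctions. The function name `span` is hypothetical.

```python
def span(start, end):
    """Approximate segment length as the difference of two map coordinates."""
    return abs(end - start)

# SV40 early polyadenylation signal, coordinates 2770 to 2533.
sv40_polya = span(2770, 2533)

# pBR328 region 375 to 2422, minus the SalI-AvaI deletion 651 to 1425.
pbr328_region = span(375, 2422)
sal_ava_deletion = span(651, 1425)
pbr328_retained = pbr328_region - sal_ava_deletion

print(sv40_polya, pbr328_region, sal_ava_deletion, pbr328_retained)
```

On this convention the pBR328-derived portion of pEE6 is roughly 1.3 kb once the SalI to AvaI deletion is taken into account.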

The 2129 base-pair NcoI fragment containing the hCMV-MIE
promoter-enhancer and 5'-untranslated sequence was isolated from
pCMGS by partial NcoI digestion and inserted at the NcoI site
overlapping the translation initiation signal of TIMP in pEE6.TIMP
to generate the plasmid pHT.1 (shown in Figure 2).

A GS gene was introduced into pHT.1 to allow selection of permanent
cell lines by introducing the 5.5 kb PvuI - BamHI fragment of pSVLGS.1
(Figure 1) at the BamHI site of pHT.1 after addition of a synthetic
BamHI linker to PvuI digested pSVLGS.1 to form pHT.1GS. In this
plasmid the hCMV-TIMP and GS transcription units transcribe in the
same orientation.

pHT.1GS was introduced into CHO-K1 cells by calcium phosphate
mediated transfection and clones resistant to 20 µM MSX were isolated
2-3 weeks post-transfection. TIMP secretion rates were determined
by testing culture supernatants in a specific two-site ELISA, based
on a sheep anti-TIMP polyclonal antibody as a capture antibody and a
mouse TIMP monoclonal as the recognition antibody, binding of the
monoclonal being revealed using a sheep anti-mouse IgG peroxidase
conjugate. Purified natural TIMP was used as a standard for
calibration of the assay and all curves were linear in the range of
2 - 20 ng ml^-1. No non-specific reaction was detectable in CHO-cell
conditioned culture media.

One cell line GS.19 was subsequently recloned, and a sub-clone GS
19-12 secretes TIMP at a very high level of 3 x 10 

molecules/cell/day. Total genomic DNA extracted from this cell line 
was hybridised with a TIMP probe by Southern blot analysis using 
standard techniques and shown to contain a single intact copy of the 
TIMP transcription unit per cell (as well as two re-arranged plasmid
bands). This cell line was selected for resistance to higher levels
of MSX and in the first selection a pool of cells resistant to 500 µM
MSX was isolated and recloned. The clone GS-19.6(500)14 secretes
3 x 10^9 molecules TIMP/cell/day. The vector copy-number in this
cell line is approx. 20 - 30 copies/cell. Subsequent rounds of
selection for further gene amplification did not lead to increased
TIMP secretion.

Thus it appears that the hCMV-TIMP transcription unit from pHT.1 can
be very efficiently expressed in CHO-K1 cells at approximately a
single copy per cell and a single round of gene amplification leads
to secretion rates which are maximal using current methods.
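
As a rough worked example of what these figures imply, the secretion rate of the amplified clone can be divided by its estimated copy number. The Python sketch below uses only the values stated above for the amplified clone (3 x 10^9 molecules/cell/day at approximately 20 to 30 copies/cell); the function name is hypothetical and the calculation assumes, purely for illustration, that every vector copy contributes equally.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def per_copy_rate(molecules_per_cell_per_day, copies_per_cell):
    """Average output per vector copy, assuming all copies are equally active."""
    return molecules_per_cell_per_day / copies_per_cell

amplified_rate = 3e9  # molecules TIMP/cell/day for the amplified clone

low = per_copy_rate(amplified_rate, 30)   # upper copy-number estimate
high = per_copy_rate(amplified_rate, 20)  # lower copy-number estimate
per_second = amplified_rate / SECONDS_PER_DAY

print(f"{low:.1e} to {high:.1e} molecules/copy/cell/day")
print(f"about {per_second:.0f} molecules/cell/second")
```

On these assumptions each copy accounts for roughly 1.0 x 10^8 to 1.5 x 10^8 molecules per cell per day, and the cell as a whole secretes on the order of 3.5 x 10^4 molecules per second.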

Example 3 

In order to test whether the hCMV-MIE promoter-enhancer-leader can
be used to direct the efficient expression of other protein
sequences, two different but related plasminogen activator coding
sequences (designated PA-1 and PA-2) were introduced into CHO-K1
cells in vectors in which the protein coding sequences were fused
directly to the hCMV sequence.

In both these cases, there is no NcoI site at the beginning of the
translated sequence and so synthetic oligonucleotides were used to
recreate the authentic coding sequence from suitable restriction
sites within the translated region. The sequence of the modified
hCMV translation-initiation signal as used in pHT.1 was also built
into the synthetic oligonucleotide which then ended in a PstI
restriction site. The Pst-1m fragment of hCMV was then inserted at
this site to create the complete promoter-enhancer-leader sequence.

The hCMV-plasminogen activator transcription units were introduced
into CHO-K1 cells after inserting a GS gene at the unique BamHI site
as above and MSX resistant cell lines secreting plasminogen
activator were isolated.

The secretion rates of the best initial transfectant cell lines in
each case are given in Table 2. From this it is clear that the hCMV
promoter-enhancer leader can also be used to direct the efficient
expression of these two plasminogen activator proteins.



Table 2

Secretion rates of the different plasminogen activator proteins from
transfectant CHO cell lines.

Plasminogen activator    Molecules secreted/cell/day
PA-1                     5.5 x 10^7
PA-2                     1.1 x 10^8



Example 4

pEE6.hCMV was made by ligating the Pst-1m fragment of hCMV,
HindIII-digested pEE6 and the complementary oligonucleotides of the
sequence

    GTCACCGTCCTTGACACGA
    |||||||||||||||||||
ACGTCAGTGGCAGGAACTGTGCTTCGA

cDNA encoding an immunoglobulin light-chain was inserted at the
EcoRI site of pEE6.hCMV such that the hCMV-MIE promoter-enhancer
leader could direct expression of the cDNA and a selectable marker
gene containing the SV40 origin of replication was inserted at the
BamHI site of each plasmid.

This plasmid was transfected into COS-1 monkey kidney cells by a
standard DEAE-dextran transfection procedure and transient 
expression was monitored 72 hours post transfection. Light chain 
was secreted into the medium at a level of at least 100 ng/ml,
indicating that light chain can indeed be expressed from a
transcription unit containing the entire hCMV-MIE 5'-untranslated
sequence up to but not including the translation initiation ATG,
followed by 15 bases of natural 5'-untranslated sequence of the
mouse immunoglobulin light-chain gene.

CLAIMS

1. A vector containing a DNA sequence comprising the promoter,
enhancer and substantially complete 5'-untranslated region
including the first intron of the hCMV-MIE gene.

2. A vector according to Claim 1 wherein the vector includes a
restriction site for insertion of a heterologous gene.

3. A vector containing a DNA sequence comprising the promoter,
enhancer and substantially complete 5'-untranslated region
including the first intron of the hCMV-MIE gene upstream of a
heterologous gene.

4. A vector according to Claim 3 wherein the hCMV-MIE DNA is
linked directly to the DNA coding sequence of a heterologous
gene.

5. A vector according to Claim 4 wherein the hCMV-MIE DNA includes
a translation initiation signal.

6. A host cell transfected with a vector according to any of the
preceding claims.

7. A process for the production of a heterologous polypeptide by
culturing a host cell according to Claim 6.

8. The use of a DNA sequence comprising the promoter, enhancer and
substantially complete 5'-untranslated region including the
first intron of the hCMV-MIE gene for expression of a
heterologous gene.

9. The use of a DNA sequence according to Claim 8 wherein the
hCMV-MIE derived DNA is linked directly to the coding sequence
of the heterologous gene.

10. Plasmids pCMGS, pEE6.hCMV and pHT.1.



Legible restriction sites in this block: NcoI, DsaI, StyI, PstI
   1 CCATGGTGTCAAGGACGGTGACTGCAGTGAATAATAAAATGTGTGTTTGTCCGAAATACG 60
     GGTACCACAGTTCCTGCCACTGACGTCACTTATTATTTTACACACAAACAGGCTTTATGC

  61 CGTTTTGAGATTTCTGTCGCCGACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCG 120
     GCAAAACTCTAAAGACAGCGGCTGATTTAAGTACAGCGCGCTATCACCACAAATAGCGGC

Legible restriction sites in this block: ClaI
 121 ATAGAGATGGCGATATTGGAAAAATCGATATTTGAAAATATGGCATATTGAAAATGTCGC 180
     TATCTCTACCGCTATAACCTTTTTAGCTATAAACTTTTATACCGTATAACTTTTACAGCG

Legible restriction sites in this block: EcoRV
 181 CGATGTGAGTTTCTGTGTAACTGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATAC 240
     GCTACACTCAAAGACACATTGACTATAGCGGTAAAAAGGTTTTCACTAAAAACCCGTATG

Legible restriction sites in this block: EcoRV
 241 GCGATATCTGGCGATAGCGGCTTATATCGTTTACGGGGGATGGCGATAGACGACTTTGGT 300
     CGCTATAGACCGCTATCGCCGAATATAGCAAATGCCCCCTACCGCTATCTGCTGAAACCA

 301 GACTTGGGCGATTCTGTGTGTCGCAAATATCGCAGTTTCGATATAGGTGACAGACGATAT 360
     CTGAACCCGCTAAGACACACAGCGTTTATAGCGTCAAAGCTATATCCACTGTCTGCTATA

Legible restriction sites in this block: CfrI, BalI, HaeI, NsiI, ClaI
 361 GAGGCTATATCGCCGATAGAGGCGACATCAAGCTGGCACATGGCCAATGCATATCGATCT 420
     CTCCGATATAGCGGCTATCTCCGCTGTAGTTCGACCGTGTACCGGTTACGTATAGCTAGA

Fig. 4A

Legible restriction sites in this block: SspI, CfrI, BalI, HaeI
 421 ATACATTGAATCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCA 480
     TATGTAACTTAGTTATAACCGGTAATCGGTATAATAAGTAACCAATATATCGTATTTAGT

Legible restriction sites in this block: SspI, CfrI, BalI, HaeI
 481 ATATTGGCTATTGGCCATTGCATACGTTGTATCCATATCATAATATGTACATTTATATTG 540
     TATAACCGATAACCGGTAACGTATGCAACATAGGTATAGTATTATACATGTAAATATAAC

Legible restriction sites in this block: HincII, MmeI, SpeI
 541 GCTCATGTCCAACATTACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAAT 600
     CGAGTACAGGTTGTAATGGCGGTACAACTGTAACTAATAACTGATCAATAATTATCATTA

 601 CAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGG 660
     GTTAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAATGCC

Legible restriction sites in this block: BglI, AhaII, AatII
 661 TAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGT 720
     ATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCA

Legible restriction sites in this block: AhaII, AatII
 721 ATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTAC 780
     TACAAGGGTATCATTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTCATAAATG

Legible restriction sites in this block: BglI, NdeI
 781 GGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTG 840
     CCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGTTCATGCGGGGGATAAC

Fig. 4B

Legible restriction sites in this block: AhaII, AatII, BglI
 841 ACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACT 900
     TGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCATGTACTGGAATACCCTGA

Legible restriction sites in this block: SnaBI, DsaI, NcoI, StyI
 901 TTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTT 960
     AAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGATAATGGTACCACTACGCCAAAA

 961 GGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACC 1020
     CCGTCATGTAGTTACCCGCACCTATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGG

Legible restriction sites in this block: AhaII, AatII, BanI
1021 CCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTC